Automated Speech Recognition Technology for Dialogue Interaction with Non-Native Interlocutors
Abstract
Dialogue interaction with remote interlocutors is a difficult application area for speech recognition technology because of the limited duration of acoustic context available for adaptation, the narrow-band and compressed signal encoding used in telecommunications, the high variability of spontaneous speech, and processing-time constraints. It is even more difficult when the interlocutors are non-native speakers, because of broader allophonic variation, less canonical prosodic patterns, a higher rate of false starts and incomplete words, unusual word choice, and a lower probability of grammatically well-formed sentences. We present a comparative study of various approaches to speech recognition in a non-native context. Comparing systems in terms of their accuracy and real-time factor, we find that a Kaldi-based Deep Neural Network Acoustic Model (DNN-AM) system with online speaker adaptation outperforms the other available methods by a wide margin.
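The two comparison criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not say which accuracy metric was used, so word error rate (WER) is assumed here as a common choice. The real-time factor is taken in its usual sense of processing time divided by audio duration. The function names and the sample sentences are hypothetical and are not drawn from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical utterance: one substitution and one deletion in six words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    rtf = real_time_factor(processing_seconds=2.5, audio_seconds=10.0)
    print(f"WER = {wer:.3f}, RTF = {rtf:.2f}")
```

A real-time factor below 1 means the recognizer finishes faster than the audio plays, which is the regime a live dialogue system needs. A lower WER means fewer recognition errors.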
Similar resources
Improving the speech recognition performance of beginners in spoken conversational interaction for language learning
The provision of automatic systems that can provide conversational practice for beginners would make a valuable addition to existing aids for foreign language teaching. To achieve this goal, the SCILL (Spoken Conversational Interaction for Language Learning) project is developing a spoken dialogue system that is capable of maintaining interactive dialogues with non-native students in the target...
Speed vs. Accuracy: Designing an Optimal ASR System for Spontaneous Non-Native Speech in a Real-Time Application
Automatic dialog interaction with remote interlocutors is a difficult application area for speech recognition technology because of the limited acoustic context, poor signal representation, high variability of spontaneous speech and limited time available to do the recognition of noncanonical spoken production. We present the speech recognition system for the non-native dialog applications that...
Using Task-Oriented Spoken Dialogue Systems for Language Learning: Potential, Practical Applications and Challenges
The technology developed for task-based spoken dialogue systems (SDS) has a significant potential for Computer-Assisted Language Learning. Based on the CMU Let’s Go SDS, we describe two areas in which we investigated adaptations of the technology to non-native speakers: speech recognition and correction prompt generation. Although difficulties remain, particularly towards robust understanding, ...
Does voice anthropomorphism affect lexical alignment in speech-based human-computer dialogue?
A common observation in dialogue research is that people tend to entrain, or align, linguistically with their interlocutors. This phenomenon offers a potentially important way to shape user behavior in human-computer dialogue interactions but little is known about the mechanisms that underlie it and how they may be affected by interlocutor design. We report a Wizard of Oz study that explored ho...
Synchronizing Dialogue Contributions Characters in a Virtual Rea
Synchronizing user and system actions in a real-time virtual reality environment is a challenging task. Key components of a dialogue system like speech recognition, discourse processing, speech generation and synthesis all contribute significant delays to the response time. With human interlocutors, however, a continuous flow of conversation is important as any implausible gap may cause confusi...